Take-home Exercise 2: Applied Spatial Interaction Models: A case study of Singapore public bus commuter flows
1 Overview
- Chapter 16
- MPSZ subzones are way too coarse: each one covers too many catchment areas. A subzone could be very big while the area people actually reside in could be very small.
- MAUP - modifiable areal unit problem - to address the limitations of formal admin boundaries
- Chapter 15: low and high flows - serves as exploratory purpose to see spatial interaction
- Can we build model to explain this? Use SIM.
- Poisson regression (response variable should be non-negative integers)
- GLM (generalised linear model)
- prepare 1 set of data for origin and destination (could be origin-constrained or destination-constrained)
- Passenger volume by origin (prepare flow data at analytical hexagon layer) –> HOE3
- Train station & train stn exit point: should choose either one
- Sch directory –> attractiveness: to derive school location to come up with more specific indicators (HOE3 uses population but should use attractiveness)
- Property –> total dwelling units (proxy for population) (propulsive factor)
- Biz/entertainment –> used for journey to work for model. Don't have to use all, e.g. shops that only open at 10am. Might have to remove certain categories like theatres.
- WDM (ppl go work) - attractiveness = schools, workplaces
- Calibrate different models and compare to see which one gives the best performance
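The notes above point to Poisson regression fitted as a GLM. As a minimal sketch of what an origin-constrained model could look like, the block below fits one on synthetic data with base R's glm(). The column names (TRIPS, ORIGIN_GRID, DEST_ATTR, DIST) and all values are hypothetical and are not taken from the exercise data.

```r
# Synthetic flow table: every name and value here is illustrative only.
set.seed(1)
flows <- expand.grid(ORIGIN_GRID = factor(1:5), DESTIN_GRID = factor(1:5))
flows <- flows[as.integer(flows$ORIGIN_GRID) != as.integer(flows$DESTIN_GRID), ]
flows$DEST_ATTR <- c(20, 50, 10, 80, 35)[as.integer(flows$DESTIN_GRID)]
flows$DIST <- 750 * abs(as.integer(flows$ORIGIN_GRID) - as.integer(flows$DESTIN_GRID))

# Simulate non-negative integer trip counts with distance decay.
mu <- exp(3 + 0.8 * log(flows$DEST_ATTR) - 1.2 * log(flows$DIST / 750))
flows$TRIPS <- rpois(nrow(flows), mu)

# Origin-constrained model: one fixed effect per origin zone,
# plus log-transformed destination attractiveness and distance.
orc_sim <- glm(TRIPS ~ ORIGIN_GRID + log(DEST_ATTR) + log(DIST),
               family = poisson(link = "log"),
               data = flows)

coef(orc_sim)[c("log(DEST_ATTR)", "log(DIST)")]
```

A negative coefficient on log(DIST) would indicate distance decay. A destination-constrained variant would swap the origin fixed effect for a destination one.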
1.1 Background
Urban mobility, characterized by the daily commute of urban dwellers from homes to workplaces, presents complex challenges for transport operators and urban managers. Traditional approaches to understanding these mobility patterns, such as commuter surveys, are often hindered by high costs, time-consuming processes, and the rapid obsolescence of collected data. However, the digitalisation of city-wide urban infrastructures, including public buses, mass rapid transits, and other utilities, alongside the advent of pervasive computing technologies like GPS and SMART cards, offers a new paradigm in tracking and analyzing urban movement.
This assignment is driven by two primary motivations. First, despite the growing availability of open data for public use, there is a noticeable gap in applied research demonstrating how these diverse data sources can be effectively integrated and analyzed to inform policy-making decisions. Second, there is a need to showcase how geospatial data science and analysis (GDSA) can be utilized in practical decision-making scenarios.
The core task of this assignment is to conduct a case study that exhibits the potential value of GDSA. By leveraging publicly available data from multiple sources, the goal is to build spatial interaction models that unravel the factors influencing urban mobility patterns, particularly focusing on public bus transit. This exercise aims to bridge the gap between the abundance of geospatially-referenced data and its practical application, thereby enhancing the return on investment in data collection and management, and ultimately supporting informed policy-making in urban mobility.
1.2 Objectives
The specific tasks of this take-home exercise are as follows:
Geospatial Data Science
Derive an analytical hexagon data of 375m (this distance is the perpendicular distance between the centre of the hexagon and its edges) to represent the traffic analysis zone (TAZ).
With reference to the time intervals provided in the table below, construct an O-D matrix of commuter flows for a time interval of your choice by integrating Passenger Volume by Origin Destination Bus Stops and Bus Stop Location from LTA DataMall. The O-D matrix must be aggregated at the analytical hexagon level.
| Peak hour period | Bus tap on time |
| --- | --- |
| Weekday morning peak | 6am to 9am |
| Weekday afternoon peak | 5pm to 8pm |
| Weekend/holiday morning peak | 11am to 2pm |
| Weekend/holiday evening peak | 4pm to 7pm |
Display the O-D flows of the passenger trips by using appropriate geovisualisation methods (not more than 5 maps).
Describe the spatial patterns revealed by the geovisualisation (not more than 100 words per visual).
Assemble at least three propulsive and three attractiveness variables by using aspatial and geospatial data from publicly available sources.
Compute a distance matrix by using the analytical hexagon data derived earlier.
Spatial Interaction Modelling
Calibrate spatial interaction models to determine factors affecting urban commuting flows at the selected time interval.
Present the modelling results by using appropriate geovisualisation and graphical visualisation methods. (Not more than 5 visuals)
With reference to the Spatial Interaction Model output tables, maps and data visualisation prepared, describe the modelling results. (not more than 100 words per visual).
2 Loading Packages
3 Data Preparation
For the purpose of this assignment, the following data will be used:
| | Type | Name | As of Date | Format | Source |
| --- | --- | --- | --- | --- | --- |
| 1 | Aspatial | Passenger Volume by Origin Destination Bus Stops | Oct 2023 | .csv | LTA DataMall |
| 2 | Aspatial | School Directory and Information (General information of schools) | Mar 2022 | .csv | Data.gov.sg |
| 3 | Aspatial | HDB Property Information (Geocoded) | Sep 2021 | .csv | Courtesy of Prof T. S. Kam |
| 4 | Geospatial | Bus Stop Location | Jul 2023 | .shp | LTA DataMall |
| 5 | Geospatial | Train Station | Feb 2023 | .shp | LTA DataMall |
| 6 | Geospatial | Train Station Exit Point | Aug 2023 | .shp | LTA DataMall |
| 7 | Geospatial | Master Plan 2019 Subzone Boundary | 2019 | .shp | Courtesy of Prof T.S. Kam |
| 8 | Geospatial | Business (incl. industrial parks), entertn, F&B, FinServ, Leisure&Recreation and Retails (Geospatial data sets of the locations of business establishments, entertainments, food and beverage outlets, financial centres, leisure and recreation centres, retail and services stores/outlets compiled for urban mobility study) | | .shp | Courtesy of Prof T.S. Kam |
3.1 O-D Data
The Passenger Volume by Origin Destination Bus Stops dataset for October 2023, downloaded from LTA DataMall, is imported by using read_csv() of the readr package.
glimpse() of the dplyr package allows us to see all columns and their data type in the data frame.
Rows: 5,694,297
Columns: 7
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <chr> "04168", "04168", "80119", "80119", "44069", "2028…
$ DESTINATION_PT_CODE <chr> "10051", "10051", "90079", "90079", "17229", "2014…
$ TOTAL_TRIPS <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…
Observations:
- There are 7 variables in the odbus tibble data, they are:
- YEAR_MONTH: Month in which data is collected
- DAY_TYPE: Weekdays or weekends/holidays
- TIME_PER_HOUR: Hour which the passenger trip is based on, in intervals from 0 to 23 hours
- PT_TYPE: Type of public transport, i.e. bus
- ORIGIN_PT_CODE: Origin bus stop ID
- DESTINATION_PT_CODE: Destination bus stop ID
- TOTAL_TRIPS: Number of trips
We also note that values in ORIGIN_PT_CODE and DESTINATION_PT_CODE are in character data type. These should be in factor data type for further processing and georeferencing.
as.factor() can be used to convert the variables ORIGIN_PT_CODE and DESTINATION_PT_CODE from character to factor data type. We use glimpse() again to check the results.
odbus$ORIGIN_PT_CODE <- as.factor(odbus$ORIGIN_PT_CODE)
odbus$DESTINATION_PT_CODE <- as.factor(odbus$DESTINATION_PT_CODE)
glimpse(odbus)
Rows: 5,694,297
Columns: 7
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <fct> 04168, 04168, 80119, 80119, 44069, 20281, 20281, 1…
$ DESTINATION_PT_CODE <fct> 10051, 10051, 90079, 90079, 17229, 20141, 20141, 1…
$ TOTAL_TRIPS <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…
Note that both of them are in factor data type now.
In our study, we focus on one of the peak hour periods identified: the weekday morning peak. We can employ a combination of the following functions to obtain the relevant data.
A summary of the functions used is as follows:
filter() of dplyr package: Retains rows that satisfy our condition (i.e. weekday morning peak period).
select() of dplyr package: Retains the desired variables for further analysis.
group_by() and summarise() of dplyr package: Aggregate the total trips at each combination of origin bus stop, destination bus stop, and peak period.
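The steps listed above can be sketched on a toy table as follows. The data values are made up, and the hour filter is an assumption: this sketch treats the 6am to 9am window as tap-on hours 6, 7 and 8, which may differ from the window used in the actual analysis.

```r
library(dplyr)

# Toy stand-in for the odbus tibble; values are made up for illustration.
odbus_toy <- tibble(
  DAY_TYPE            = c("WEEKDAY", "WEEKDAY", "WEEKDAY", "WEEKENDS/HOLIDAY"),
  TIME_PER_HOUR       = c(6, 8, 10, 7),
  ORIGIN_PT_CODE      = c("04168", "04168", "04168", "04168"),
  DESTINATION_PT_CODE = c("10051", "10051", "10051", "10051"),
  TOTAL_TRIPS         = c(3, 5, 7, 2)
)

wdm_peak_toy <- odbus_toy %>%
  # Assumed weekday morning peak window: tap-on hours 6 to 8 inclusive.
  filter(DAY_TYPE == "WEEKDAY", TIME_PER_HOUR >= 6, TIME_PER_HOUR <= 8) %>%
  # Keep only the columns needed downstream.
  select(ORIGIN_PT_CODE, DESTINATION_PT_CODE, TOTAL_TRIPS) %>%
  # Total trips per origin-destination bus stop pair.
  group_by(ORIGIN_PT_CODE, DESTINATION_PT_CODE) %>%
  summarise(TRIPS = sum(TOTAL_TRIPS), .groups = "drop")
```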
Let’s check the output using the glimpse() function of dplyr.
3.2 Geospatial Data
3.2.1 Importing Geospatial Data
For the purpose of this exercise, three geospatial data sets will be used. They are:
- MPSZ-2019: This data provides the sub-zone boundary of URA Master Plan 2019.
- BusStop: This data provides the location of bus stop as at Jul 2023.
- Analytical hexagon: Hexagonal grids of 375m (this distance is the perpendicular distance between the centre of the hexagon and its edges) to represent the traffic analysis zone.
In this section, we import the shapefiles into RStudio using st_read() function of sf package. st_transform() function of sf package is used to transform the projection to coordinate reference system (CRS) 3414, which is the EPSG code for the SVY21 projection used in Singapore.
Reading layer `MPSZ-2019' from data source
`C:\kytjy\ISSS624\Take-Home_Ex\Take-Home_Ex2\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
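Since the subzone layer is read in WGS 84, it has to be reprojected before distances in metres make sense. As a small illustration of st_transform(), the sketch below reprojects a single hypothetical point rather than the actual shapefile. The point is placed at the SVY21 projection origin, so the result should come out close to the false easting and northing of EPSG:3414.

```r
library(sf)

# A single point in WGS 84 (EPSG:4326), placed at the SVY21 projection origin.
pt_wgs84 <- st_sfc(st_point(c(103.833333333333, 1.36666666666667)), crs = 4326)

# Reproject to SVY21 / Singapore TM (EPSG:3414); coordinates become metres.
pt_svy21 <- st_transform(pt_wgs84, crs = 3414)

st_crs(pt_svy21)$epsg
st_coordinates(pt_svy21)
```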
In the code chunk below, tm_shape() of tmap package is used to define the input data (i.e mpsz) and tm_polygons() is used to draw the planning subzone polygons.
Reading layer `BusStop' from data source
`C:\kytjy\ISSS624\Take-Home_Ex\Take-Home_Ex2\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
Busstop represents sf point objects for 5161 bus stops in Singapore.
To visualise the points of the bus stops, we can use tm_shape() of tmap package with each bus stop point displayed as dots. tmap_mode allows us to view static maps with plot and interactive maps with view.
Note that there are 5 bus stops located outside Singapore: bus stops 46239, 46609, 47701, 46211, and 46219. The code chunk below uses filter() to exclude them.
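A minimal sketch of this exclusion on a toy table is shown below. The object names busstop_toy and outside_sg are hypothetical, and only the BUS_STOP_N column is modelled.

```r
library(dplyr)

# Toy bus stop table; only BUS_STOP_N matters for this step.
busstop_toy <- tibble(BUS_STOP_N = c("46239", "46609", "01012", "47701", "43709"))

# Bus stop codes identified above as lying outside Singapore.
outside_sg <- c("46239", "46609", "47701", "46211", "46219")

# Keep every bus stop whose code is not in the exclusion list.
busstop_sg <- busstop_toy %>%
  filter(!BUS_STOP_N %in% outside_sg)
```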
A hexagonal grid is used to represent the traffic analysis zones, which helps to model travel demand through capturing the spatial aspects of trip origins and destinations.
Step 1: Create Hexagonal Grids
We first create a hexagonal grid layer of 375m (the perpendicular distance between the centre of a hexagon and its edges) with st_make_grid(). st_sf() then converts the grid into an sf object, and row_number() assigns an ID to each hexagon.
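As a rough sketch of this step, the block below builds a hexagonal grid over a hypothetical 5 km by 5 km bounding box instead of the real bus stop layer. Because cellsize measures the distance between opposite edges, a 375m centre-to-edge distance corresponds to cellsize = 750. The IDs are assigned with seq_along() here for simplicity.

```r
library(sf)

# Hypothetical 5 km x 5 km study area in SVY21 (EPSG:3414), in metres.
study_area <- st_as_sfc(st_bbox(c(xmin = 20000, ymin = 30000,
                                  xmax = 25000, ymax = 35000),
                                crs = st_crs(3414)))

# cellsize is the edge-to-edge distance: 2 x 375 m = 750 m.
hex_geom <- st_make_grid(study_area, cellsize = 750,
                         what = "polygons", square = FALSE)

# Wrap the geometries in an sf data frame and number the hexagons.
hex_grid <- st_sf(grid_id = seq_along(hex_geom), geometry = hex_geom)

# Neighbouring hexagon centres should sit one cellsize apart.
centres <- st_coordinates(st_centroid(hex_geom))
min_gap <- min(dist(centres))
```

The smallest centre-to-centre gap gives a quick check that the grid has the intended size.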
st_make_grid() is used to create a grid over a spatial object. The four arguments used here are:
- x: sf object; the input spatial data
- cellsize: for hexagonal cells, the distance between opposite edges in the units of the CRS of the input data. In this case, we take cellsize to be 375m * 2 = 750m
- what: character; one of "polygons", "corners", or "centers"
- square: logical; indicates whether to create a square grid (TRUE) or a hexagonal grid (FALSE)
Step 2: Remove grids with no bus stops
We count the number of bus stops in each grid and retain only the grids with bus stops using the code chunks below.
st_intersects is used to identify the bus stops falling inside each hexagon, while lengths returns the number of bus stops inside each hexagon.
# Create a column containing the count of bus stops in each grid
area_hexagon_grid$busstops = lengths(st_intersects(area_hexagon_grid, busstop))
# Retain hexagons with bus stops
area_hexagon_grid = filter(area_hexagon_grid, busstops > 0)
sum(area_hexagon_grid$busstops, na.rm = TRUE)
[1] 5156
Note that there are 5156 bus stops, which tallies with the 5161 bus stops in the raw BusStop shapefile less the 5 bus stops outside the Singapore boundary. This suggests that the hexagons have captured all the Singapore bus stops as expected.
3.3 Geospatial Data Wrangling
3.3.1 Combining Busstop and Hexagons
Code chunk below populates the grid ID (i.e. grid_id) of hexagon_w_busstops sf data frame into busstop sf data frame using the following functions:
- st_intersection() is used to perform point and polygon overlay and the output will be in point sf object.
- select() of dplyr package is then used to retain preferred variables from the data frames.
- st_drop_geometry() removes the geometry data so that the result can be manipulated like a regular data frame using tidyr and dplyr functions.
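The overlay described above can be sketched on toy geometries as follows. Two hypothetical square zones stand in for the hexagons, and the three bus stops are made up; the helper zone() exists only for this illustration.

```r
library(sf)
library(dplyr)

# Helper (illustration only): a 10 m square with its lower-left corner at (x0, 0).
zone <- function(x0) st_polygon(list(rbind(c(x0, 0), c(x0 + 10, 0),
                                           c(x0 + 10, 10), c(x0, 10), c(x0, 0))))

# Two hypothetical zones standing in for hexagons (EPSG:3414).
zones_toy <- st_sf(grid_id = c(1L, 2L),
                   geometry = st_sfc(zone(0), zone(10), crs = 3414))

# Three made-up bus stops, each strictly inside one zone.
stops_toy <- st_sf(BUS_STOP_N = c("A", "B", "C"),
                   geometry = st_sfc(st_point(c(2, 5)), st_point(c(12, 5)),
                                     st_point(c(18, 3)), crs = 3414))

bs_wgrids_toy <- st_intersection(stops_toy, zones_toy) %>%  # point-in-polygon overlay
  select(BUS_STOP_N, grid_id) %>%                           # keep the needed columns
  st_drop_geometry()                                        # plain data frame
```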
Before we proceed, let’s perform a duplicates check on bs_wgrids.
# A tibble: 8 × 4
BUS_STOP_N BUS_ROOF_N LOC_DESC grid_id
<chr> <chr> <chr> <int>
1 43709 B06 BLK 644 788
2 43709 B06 BLK 644 788
3 58031 UNK OPP CANBERRA DR 1214
4 58031 UNK OPP CANBERRA DR 1214
5 51071 B21 MACRITCHIE RESERVOIR 1262
6 51071 B21 MACRITCHIE RESERVOIR 1262
7 97079 B14 OPP ST. JOHN'S CRES 2062
8 97079 B14 OPP ST. JOHN'S CRES 2062
Results displayed 4 seemingly genuine duplicated records, with same bus stop number, roof, and location description. We remove these to prevent double-counting.
The code chunk below helps retain unique records.
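A minimal sketch of the duplicate check and clean-up on a toy lookup table is given below. The use of distinct() here is an assumption; the original chunk may have used a different function to the same effect.

```r
library(dplyr)

# Toy lookup with one repeated record (values mirror the pattern above).
bs_toy <- tibble(
  BUS_STOP_N = c("43709", "43709", "58031"),
  LOC_DESC   = c("BLK 644", "BLK 644", "OPP CANBERRA DR"),
  grid_id    = c(788L, 788L, 1214L)
)

# Rows that appear more than once across all columns.
dupes <- bs_toy %>%
  group_by(across(everything())) %>%
  filter(n() > 1) %>%
  ungroup()

# Keep one copy of each record.
bs_unique <- distinct(bs_toy)
```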
3.3.2 Populate Passenger Volume data with Grid IDs
Next, we are going to append the Grid IDs based on origin bus stops from bs_wgrids data frame onto WDMpeak data frame.
Next, we will update od_data data frame with the Grid IDs of destination bus stops.
od_data <- left_join(od_data , bs_wgrids,
by = c("DESTIN_BS" = "BUS_STOP_N")) %>%
rename(DESTIN_GRID = grid_id)
glimpse(od_data)
Rows: 243,344
Columns: 9
Groups: ORIGIN_BS [5,029]
$ ORIGIN_BS <chr> "01012", "01012", "01012", "01012", "01012", "01012", "01…
$ DESTIN_BS <chr> "01112", "01113", "01121", "01211", "01311", "07371", "60…
$ TRIPS <dbl> 290, 118, 77, 118, 165, 14, 30, 16, 35, 26, 2, 8, 1, 2, 2…
$ BUS_ROOF_N.x <chr> "B03", "B03", "B03", "B03", "B03", "B03", "B03", "B03", "…
$ LOC_DESC.x <chr> "HOTEL GRAND PACIFIC", "HOTEL GRAND PACIFIC", "HOTEL GRAN…
$ ORIGIN_GRID <int> 1334, 1334, 1334, 1334, 1334, 1334, 1334, 1334, 1334, 133…
$ BUS_ROOF_N.y <chr> "B07", "B09", "B11", "B13", "B01", "B01", "B01", "B03", "…
$ LOC_DESC.y <chr> "OPP BUGIS STN EXIT C", "BUGIS STN EXIT B", "STAMFORD PR …
$ DESTIN_GRID <int> 1354, 1354, 1392, 1392, 1411, 1411, 1393, 1431, 1450, 143…
The code chunk below allows us to check for duplicates to prevent double counting. The results indicate that there are no duplicates found.
# A tibble: 0 × 9
# ℹ 9 variables: ORIGIN_BS <chr>, DESTIN_BS <chr>, TRIPS <dbl>,
# BUS_ROOF_N.x <chr>, LOC_DESC.x <chr>, ORIGIN_GRID <int>,
# BUS_ROOF_N.y <chr>, LOC_DESC.y <chr>, DESTIN_GRID <int>
The code chunk below removes rows with missing data using drop_na() and aggregates the total passenger trips at each origin-destination grid level with group_by() and summarise().
Our resulting O-D matrix organises the commuter flows for the weekday morning peak period in a column-wise format, with ORIGIN_GRID representing the origin and DESTIN_GRID representing the destination.
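The clean-up and aggregation can be sketched on a toy table as follows. The rows are illustrative, and the NA is assumed to represent a bus stop that could not be matched to any hexagon.

```r
library(dplyr)
library(tidyr)

# Toy joined table; the NA stands for an unmatched bus stop (assumption).
od_toy <- tibble(
  ORIGIN_GRID = c(1334L, 1334L, 1334L, NA),
  DESTIN_GRID = c(1354L, 1354L, 1392L, 1354L),
  TRIPS       = c(290, 118, 77, 5)
)

od_matrix_toy <- od_toy %>%
  drop_na() %>%                                    # discard unmatched rows
  group_by(ORIGIN_GRID, DESTIN_GRID) %>%
  summarise(TRIPS = sum(TRIPS), .groups = "drop")  # one row per O-D pair
```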
4 Visualising Spatial Interaction
Origin-destination flow maps are a popular option to visualise connections between different spatial locations. They reflect the relationships/flows between locations and are created by monitoring movements. In our analysis, we can use OD flows to identify the patterns of bus ridership during weekday mornings.
4.1 Removing Intra-Zonal Flows
Intrazonal travels are considered localised and short duration trips within a transportation analysis zone (i.e. within a hexagon). For our analysis, we will be removing them.
Removing them drops 625 intra-zonal records, reducing the number of observations from 65,691 to 65,066.
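A minimal sketch of this filter on a toy flow table is shown below; the grid IDs and trip counts are made up.

```r
library(dplyr)

# Toy O-D table: two intra-zonal rows and one inter-zonal row.
od_toy <- tibble(
  ORIGIN_GRID = c(21L, 21L, 45L),
  DESTIN_GRID = c(21L, 45L, 45L),
  TRIPS       = c(10, 4, 7)
)

# Keep only flows whose origin and destination hexagons differ.
inter_zonal <- od_toy %>% filter(ORIGIN_GRID != DESTIN_GRID)

# Number of intra-zonal records dropped.
n_removed <- nrow(od_toy) - nrow(inter_zonal)
```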
4.2 OD Flow Distribution
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
1 2 5 11 21 38 67 125 253 660 77433
From the summary statistics above, the minimum number of passenger trips for each combination of origin and destination hexagon is 1. The maximum observed is 77,433 passenger trips during the weekday morning peak period. Furthermore, the 90th percentile is 660 passenger trips, less than 1% of the maximum. This suggests a highly right-skewed distribution.
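Decile tables of this kind can be produced with base R's quantile(). The sketch below uses made-up trip counts, not the exercise data, and also shows how a 90th percentile cut-off could be used to pick out the heaviest flows.

```r
# Made-up, right-skewed trip counts for illustration.
trips_toy <- c(1, 1, 2, 3, 5, 8, 13, 40, 120, 900)

# Deciles from the minimum (0%) to the maximum (100%).
deciles <- quantile(trips_toy, probs = seq(0, 1, by = 0.1))

# Flows at or above the 90th percentile, i.e. roughly the top 10%.
top_flows <- trips_toy[trips_toy >= deciles["90%"]]
```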

4.3 Creating Flow Lines
Desire lines visually represent the connections between originating and destination hexagons using straight lines. The od2line() function of stplanr package is utilized to create these lines. The width of each desire line is proportional to number of passenger trips, i.e. thicker lines would represent higher ridership.

Too many intersecting lines cause visual clutter and obscure our analysis. Since the 90th percentile is 660, let’s take a look at the inter-zonal flows with top 10% ridership.
The map shows that Yew Tee and Woodlands dominate bus ridership during weekday mornings, as indicated by their wider desire lines. Key routes include travel within Yew Tee, between Woodlands Checkpoint and Woodlands MRT Station, and within Woodlands. Boon Lay, Bedok, Choa Chu Kang, Clementi, Tampines, Pasir Ris, and Serangoon show higher concentrations and variations of desire lines with neighbouring hexagons, indicating higher ridership within these areas.
Longer desire lines between North and East (i.e. Woodlands and Changi) also suggest a willingness of passengers to travel a longer distance to get to their destinations.
While OD flows provide valuable insights by quickly visualising travel patterns, it is beneficial to complement them with other forms of analysis, such as spatial interaction models, for a more comprehensive understanding of the factors affecting urban commuting flows.
5 Computing Distance Matrix
A distance matrix is a two-dimensional array containing the distances between different locations. In our analysis, we can use a distance matrix to calculate the distance passengers are willing to travel by bus to get to their destinations.
5.1 Converting from sf data frame to SpatialPolygonsDataFrame
Firstly, as.Spatial() will be used to convert area_hexagon_grid from sf tibble data frame to SpatialPolygonsDataFrame of sp object as shown in the code chunk below.
class : SpatialPolygonsDataFrame
features : 831
extent : 3595.122, 48595.12, 26049.09, 50297.8 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
variables : 2
names : grid_id, busstops
min values : 21, 1
max values : 2267, 19
5.2 Computing Distance Matrix
Next, spDists() of sp package will be used to compute the Euclidean distance between the centroids of the hexagons.
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0.000 750.000 3269.174 1500.000 2704.163 3968.627 1299.038 2250.000
[2,] 750.000 0.000 2598.076 750.000 1984.313 3269.174 750.000 1500.000
[3,] 3269.174 2598.076 0.000 1984.313 750.000 750.000 2704.163 1500.000
[4,] 1500.000 750.000 1984.313 0.000 1299.038 2598.076 750.000 750.000
[5,] 2704.163 1984.313 750.000 1299.038 0.000 1299.038 1984.313 750.000
[6,] 3968.627 3269.174 750.000 2598.076 1299.038 0.000 3269.174 1984.313
[7,] 1299.038 750.000 2704.163 750.000 1984.313 3269.174 0.000 1299.038
[8,] 2250.000 1500.000 1500.000 750.000 750.000 1984.313 1299.038 0.000
The resulting output is a matrix object class, but column headers and row headers are not labeled with the grid IDs. We rename the headers for clarity.
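The distance computation and relabelling can be sketched as follows. For brevity, this example passes a plain coordinate matrix of hypothetical centroids to spDists() rather than a SpatialPolygonsDataFrame, and the grid IDs are made up.

```r
library(sp)

# Hypothetical hexagon centroids in metres, with their grid IDs.
centroids <- cbind(x = c(0, 750, 1500), y = c(0, 0, 0))
grid_ids  <- c("21", "39", "40")

# Euclidean distances (longlat = FALSE) between every pair of centroids.
dist_toy <- spDists(centroids, longlat = FALSE)

# Label rows and columns with the grid IDs.
rownames(dist_toy) <- grid_ids
colnames(dist_toy) <- grid_ids
```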
5.3 Pivoting Distance Value by Grid ID
Next, we will pivot the distance matrix into a long table by using the row and column grid IDs as shown in the code chunk below.
Notice that the within zone distance is 0.
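One way to do this pivot with base R alone is sketched below. This is an alternative shown for illustration and may not be the function used in the original chunk; the matrix values and grid IDs are hypothetical.

```r
# Small labelled distance matrix (hypothetical grid IDs, metres).
dist_toy <- matrix(c(0, 750, 750, 0), nrow = 2,
                   dimnames = list(c("21", "39"), c("21", "39")))

# as.table() + as.data.frame() unrolls the matrix to one row per cell.
dist_long <- as.data.frame(as.table(dist_toy), stringsAsFactors = FALSE)
names(dist_long) <- c("orig", "dest", "dist")
```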
5.4 Updating Intra-Zonal Distances
In this section, we are going to replace the intra-zonal distance of 0 with a constant value.
First, we find the minimum inter-zonal (non-zero) distance by using summary().
orig dest dist
Min. : 21 Min. : 21 Min. : 750
1st Qu.: 789 1st Qu.: 789 1st Qu.: 8250
Median :1200 Median :1200 Median :13269
Mean :1150 Mean :1150 Mean :14135
3rd Qu.:1529 3rd Qu.:1529 3rd Qu.:18929
Max. :2267 Max. :2267 Max. :44680
Next, an arbitrary constant distance value of 100m, smaller than the minimum inter-zonal distance of 750m, is assigned to the intra-zonal distances.
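A minimal sketch of this replacement on a toy long table is given below. A non-zero constant matters because a log-transformed distance term would be undefined at 0. The values are illustrative only.

```r
# Long-format pairs with a zero on the intra-zonal rows (toy values).
dist_long <- data.frame(orig = c(21, 21, 39), dest = c(21, 39, 39),
                        dist = c(0, 750, 0))

# Smallest non-zero distance, used to sanity-check the constant.
min_inter <- min(dist_long$dist[dist_long$dist > 0])

# Swap zero distances for the 100 m constant; leave the rest unchanged.
dist_long$dist <- ifelse(dist_long$dist == 0, 100, dist_long$dist)
```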
The code chunk below will be used to check the result data.frame.
orig dest dist
Min. : 21 Min. : 21 Min. : 100
1st Qu.: 789 1st Qu.: 789 1st Qu.: 8250
Median :1200 Median :1200 Median :13269
Mean :1150 Mean :1150 Mean :14119
3rd Qu.:1529 3rd Qu.:1529 3rd Qu.:18929
Max. :2267 Max. :2267 Max. :44680
Lastly, the code chunk below is used to save the dataframe for future use.
5.5 Combining passenger volume data with distance value
6 Preparing Origin and Destination Attributes
The following information is used to derive propulsive/attractiveness variables:
1. Business, entertn, F&B, FinServ, Leisure&Recreation and Retails: geospatial data sets of the locations of business establishments, entertainment venues, food and beverage outlets, financial centres, leisure and recreation centres, and retail and services stores/outlets.
2. Schools: This data set contains directory and general information of schools in Singapore, obtained from Data.gov.sg.
3. HDB: This data set is the geocoded version of the HDB Property Information data from Data.gov.sg. The data set is prepared using September 2021 data.
trainstationexits contains the MRT station names and exits along with their respective point geometries in CRS SVY21.
Train station exits reflect the intermodal connections with bus stops. In the context of attractiveness, these stations can be seen as destinations that attract passengers, including those who might transit to/from these stations by bus. The data can also indicate the propulsiveness aspect: how these stations act as origins for passengers who leave the MRT stations and then proceed to their final destinations via other modes of transportation like buses.
Step 1: Import shapefile
st_read() function of the sf package enables us to import the file into RStudio.
Reading layer `Train_Station_Exit_layer' from data source
`C:\kytjy\ISSS624\Take-Home_Ex\Take-Home_Ex2\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 565 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 6134.086 ymin: 27499.7 xmax: 45356.36 ymax: 47865.92
Projected CRS: SVY21
Notice there are 565 train station exits in total. Let’s visualise where these are located using the code chunk below:
Step 2: Point-in-Polygon Count Process
Next, we will count the number of train station exits located inside each hexagon.
area_hexagon_grid$`COUNT_TRAINSTATIONEXIT`<- lengths(
st_intersects(
area_hexagon_grid, trainstationexits))
sum(area_hexagon_grid$COUNT_TRAINSTATIONEXIT)
[1] 560
The 5 train station exits not accounted for could be in areas outside the hexagons, where there are no bus stops.
Summary statistics indicate that a maximum of 13 train station exits are located within a single hexagon, while at least 75% of the hexagons do not contain any exits.
business contains the details of various businesses from SMEs to bigger groups like Pan Pacific, as well as the respective point geometries in CRS SVY21.
Businesses serve as significant attractors in a city. They draw people to these locations, primarily for work purposes. The presence and density of businesses in an area can significantly influence the flow of commuters, making it a measure of attractiveness.
Step 1: Import shapefile
Reading layer `Business' from data source
`C:\kytjy\ISSS624\Take-Home_Ex\Take-Home_Ex2\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 6550 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3669.148 ymin: 25408.41 xmax: 47034.83 ymax: 50148.54
Projected CRS: SVY21 / Singapore TM
Note that there are 6550 businesses in our dataset in total. Let’s visualise where these are located using the code chunk below:
Step 2: Point-in-Polygon Count Process
Next, we will count the number of businesses in each hexagon.
Fewer than half of the hexagons contain no businesses, whereas a single hexagon houses as many as 97 businesses.
finserv contains the details of various financial centres, as well as the respective point geometries in CRS SVY21.
Financial service locations often represent significant employment centers, especially in urban and commercial areas. Many people travel to these locations for work, making them important attractors during morning peak hours. In addition, financial services typically adhere to standard office hours, which aligns well with the morning peak period of bus ridership, as a large proportion of employees would be traveling to work during this time.
Step 1: Import shapefile
Reading layer `FinServ' from data source
`C:\kytjy\ISSS624\Take-Home_Ex\Take-Home_Ex2\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 3320 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 4881.527 ymin: 25171.88 xmax: 46526.16 ymax: 49338.02
Projected CRS: SVY21 / Singapore TM
Note that there are 3320 locations of financial services in our dataset in total. Let’s visualise where these are located using the code chunk below:
Step 2: Point-in-Polygon Count Process
Next, we will count the number of financial services in each hexagon.
The summary statistics reveal that up to 139 financial services locations can be found within a single hexagon, while fewer than half of the hexagons are devoid of any such locations.
recs includes information on various leisure and recreation centers, such as playgrounds, parks, and fitness centers. It also contains their respective point geometries in the CRS SVY21.
Recreational facilities could be popular for early morning workouts and might attract a morning crowd.
Step 1: Import shapefile
Reading layer `Liesure&Recreation' from data source
`C:\kytjy\ISSS624\Take-Home_Ex\Take-Home_Ex2\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 1217 features and 30 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 6010.495 ymin: 25134.28 xmax: 48439.77 ymax: 50078.88
Projected CRS: SVY21 / Singapore TM
Note that there are 1217 locations of leisure and recreational centres in our dataset in total. Let’s visualise where these are located using the code chunk below:
Step 2: Point-in-Polygon Count Process
Next, we will count the number of facilities in each hexagon.
The summary statistics indicate that, on average, a single leisure and recreational facility is found within each hexagon, although the highest number recorded in a hexagon is 41.
retail includes information on various retail and services stores/outlets, along with their respective point geometries in the CRS SVY21.
Retail locations can be significant attractors in the morning, particularly as employment destinations for people who work in these retail and service outlets. Some retail services, like coffee shops, breakfast spots, and convenience stores, might attract early morning customers, including commuters heading to work.
Step 1: Import shapefile
Reading layer `Retails' from data source
`C:\kytjy\ISSS624\Take-Home_Ex\Take-Home_Ex2\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 37635 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 4737.982 ymin: 25171.88 xmax: 48265.04 ymax: 50135.28
Projected CRS: SVY21 / Singapore TM
Note that there are 37635 retail locations in our dataset in total. Let’s visualise where these are located using the code chunk below:
Step 2: Point-in-Polygon Count Process
Next, we will count the number of retail centres in each hexagon.
The summary statistics reveal that, on average, 44 retail and service centers can be located within a single hexagon, with the maximum number reaching 1678.
Reference
epsg.io (2023). EPSG: 3414 SVY21 / Singapore TM. https://epsg.io/3414
Miller, E. J. (2021). Traffic Analysis Zone Definition: Issues & Guidance. Travel Modelling Group. https://tmg.utoronto.ca/files/Reports/Traffic-Zone-Guidance_March-2021_Final.pdf




